IEICE global.ieice.org Site

Keyword Search Result

[Keyword] machine learning (172 hits)

Hits 161-172 of 172

Efficient Masquerade Detection Using SVM Based on Common Command Frequency in Sliding Windows
Han-Sung KIM Sung-Deok CHA

PAPER-Application Information Security

Vol: E87-D No: 11
Page(s): 2446-2452
Masqueraders who impersonate other users pose a serious threat to computer security. Unfortunately, firewalls and misuse-based intrusion detection systems are generally ineffective in detecting masqueraders. Anomaly detection techniques have been proposed as a complementary approach to overcome such limitations. However, they are not accurate enough in detection, and their false alarm rate is too high for them to be applied in practice. For example, recent empirical studies on masquerade detection using UNIX commands found the accuracy to be below 70%. In this research, we performed a comparative study to investigate the effectiveness of the SVM (Support Vector Machine) technique using the same data set and configuration reported in the previous experiments. To improve the accuracy of masquerade detection, we used command frequencies in sliding windows as feature sets. In addition, we chose to ignore commands commonly used by all the users and introduced the concept of a voting engine. Though still imperfect, we were able to improve the accuracy of masquerade detection to 80.1% and 94.8%, whereas previous studies reported accuracy of 69.3% and 62.8% in the same configurations. This study convincingly demonstrates that SVM is useful as an anomaly detection technique and that there are several advantages SVM offers as a tool to detect masqueraders.
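The feature pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the window size, the voting threshold, and the reading of "commands commonly used by all the users" as the intersection of every user's command set are all assumptions. The per-window anomaly flags passed to the voting step would come from a classifier such as an SVM, which is not shown.

```python
from collections import Counter

def common_commands(user_streams):
    """Commands that appear in every user's history (assumed meaning of 'common')."""
    sets = [set(cmds) for cmds in user_streams.values()]
    return set.intersection(*sets) if sets else set()

def window_features(commands, vocabulary, window=4, step=1):
    """Return one command-frequency vector per sliding window, ordered by `vocabulary`."""
    vectors = []
    for start in range(0, len(commands) - window + 1, step):
        counts = Counter(commands[start:start + window])
        vectors.append([counts[c] for c in vocabulary])
    return vectors

def vote(window_flags, threshold=0.5):
    """Flag a block as a masquerade when the share of anomalous windows exceeds `threshold`."""
    return sum(window_flags) / len(window_flags) > threshold

# Hypothetical toy data: two users with short command histories.
streams = {
    "alice": ["ls", "cd", "vi", "ls", "make"],
    "bob": ["ls", "cd", "mail", "ls", "cd"],
}
ignored = common_commands(streams)
vocabulary = sorted({c for cmds in streams.values() for c in cmds} - ignored)
features = window_features(streams["alice"], vocabulary, window=3)
```

Dropping the shared commands before counting keeps the feature vectors focused on commands that distinguish one user from another; the vote then smooths over isolated misclassified windows.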
Sounds of Speech Based Spoken Document Categorization: A Subword Representation Method
Weidong QU Katsuhiko SHIRAI

PAPER

Vol: E87-D No: 5
Page(s): 1175-1184
In this paper, we explore a method for spoken document categorization, which is the task of automatically assigning spoken documents to a set of predetermined categories. To categorize spoken documents, subword unit representations are used as an alternative to word units generated by either keyword spotting or large vocabulary continuous speech recognition (LVCSR). An advantage of using subword acoustic unit representations for spoken document categorization is that they do not require prior knowledge about the contents of the spoken documents and that they address the out-of-vocabulary (OOV) problem. Moreover, this method relies on the sounds of speech rather than exact orthography. The use of subword units instead of words allows approximate matching on inaccurate transcriptions and makes "sounds-like" spoken document categorization possible. We also explore the performance of our method when the training set contains both perfect and errorful phonetic transcriptions, in the hope that the classifiers can learn from the confusion characteristics of the recognizer and the pronunciation variants of words to improve the robustness of the whole system. Our experiments based on both artificial and real corrupted data sets show that the proposed method is more effective and robust than the word-based method.
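One common way to build subword features of the kind described here is to count overlapping phone n-grams. The sketch below assumes that representation; the abstract does not specify the unit type or the matching measure, so `phone_ngrams`, `overlap`, and the choice of n are illustrative assumptions only.

```python
from collections import Counter

def phone_ngrams(phones, n=3):
    """Count overlapping n-grams in a phone sequence (a bag of subword units)."""
    return Counter(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))

def overlap(a, b):
    """Share of n-gram occurrences in `a` that also occur in `b` (multiset intersection)."""
    total = sum(a.values())
    return sum((a & b).values()) / total if total else 0.0

# Hypothetical transcriptions: the second has one substituted phone.
reference = ["k", "ae", "t", "ax", "g", "ax", "r", "iy"]
recognized = ["k", "ae", "d", "ax", "g", "ax", "r", "iy"]
similarity = overlap(phone_ngrams(reference), phone_ngrams(recognized))
```

Because a single recognition error only disturbs the n-grams that span it, the remaining n-grams still match, which is what makes approximate matching on errorful transcriptions possible.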
Machine Learning via Multiresolution Approximation
Ilya BLAYVAS Ron KIMMEL

INVITED PAPER

Vol: E86-D No: 7
Page(s): 1172-1180
We consider the classification problem as a problem of approximation of a given training set. This approximation is constructed in a multiresolution framework, and organized in a tree-structure. It allows efficient training and query, both in constant time per training point. The proposed method is efficient for low-dimensional classification and regression estimation problems with large data sets.
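To make the idea of a tree-structured multiresolution approximation concrete, here is a simplified one-dimensional sketch. It is an assumption-laden illustration rather than the authors' method: inputs are assumed to lie in [0, 1), labels are assumed to be +1 or -1, each level halves the cell width, and a query returns the mean label of the finest populated cell. With a fixed depth, both training and querying touch a fixed number of cells per point.

```python
from collections import defaultdict

class MultiresolutionClassifier:
    """Label averages stored on a dyadic grid, one grid per resolution level."""

    def __init__(self, depth=4):
        self.depth = depth
        self.sums = defaultdict(float)   # (level, index) -> sum of labels
        self.counts = defaultdict(int)   # (level, index) -> number of points

    def _cells(self, x):
        # Yield the cell containing x at every level, coarsest first.
        for level in range(self.depth + 1):
            yield level, min(int(x * 2 ** level), 2 ** level - 1)

    def train(self, x, label):
        # One update per level: constant work for a fixed depth.
        for cell in self._cells(x):
            self.sums[cell] += label
            self.counts[cell] += 1

    def query(self, x):
        # Refine the estimate level by level until an empty cell is reached.
        estimate = 0.0
        for cell in self._cells(x):
            if self.counts[cell] == 0:
                break
            estimate = self.sums[cell] / self.counts[cell]
        return estimate

model = MultiresolutionClassifier(depth=3)
for x, label in [(0.1, -1), (0.2, -1), (0.8, 1), (0.9, 1)]:
    model.train(x, label)
```

The sign of `model.query(x)` serves as the predicted class; coarse levels provide a fallback wherever the fine levels have seen no data.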
Intelligent Email Categorization Based on Textual Information and Metadata
Jihoon YANG Venkat CHALASANI Sung-Yong PARK

PAPER-Artificial Intelligence, Cognitive Science

Vol: E86-D No: 7
Page(s): 1280-1288
A set of systematic experiments on intelligent email categorization has been conducted with different machine learning algorithms applied to different parts of the data in order to achieve the most accurate classification. The categorization is based not only on the body but also on the header of an email message. The metadata (e.g. sender name, sender organization, etc.) provide additional information that can be exploited to improve the categorization capability. Results of experiments on real email data demonstrate the feasibility of our approach for finding the best learning algorithm and the metadata to be used, which is a very significant contribution to email classification. It is also shown that categorization based only on the header information is comparable or superior to that based on all the information in a message for all the learning algorithms considered.
Improving Precision of the Subspace Information Criterion
Masashi SUGIYAMA

PAPER-Neural Networks and Bioengineering

Vol: E86-A No: 7
Page(s): 1885-1895
Evaluating the generalization performance of learning machines without using additional test samples is one of the most important issues in the machine learning community. The subspace information criterion (SIC) is one of the methods for this purpose, and it has been shown to be an unbiased estimator of the generalization error with finite samples. Although the mean of SIC agrees with the true generalization error even in small sample cases, the scatter of SIC can be large under some severe conditions. In this paper, we therefore investigate what degrades the precision of SIC, and discuss how its precision could be improved.
Suitable Domains for Using Ordered Attribute Trees to Impute Missing Values
Oscar-Ortega LOBO Masayuki NUMAO

PAPER-Databases

Vol: E84-D No: 2
Page(s): 262-270
Using decision trees to fill the missing values in data has been shown experimentally to be useful in some domains. However, this is not the general case. In other domains, using decision trees for imputing missing attribute values does not outperform other methods. Trying to identify the reasons behind the success or failure of the various methods for filling missing values in different domains can be useful for deciding which technique to use when learning concepts from a new domain with missing values. This paper presents a technique for approaching this goal and presents the results of applying the technique to predicting the success or failure of a method that uses decision trees to fill the missing values in an ordered manner. Results are encouraging because the obtained decision tree is simple and it can even provide hints for further improvement in the use of decision trees to impute missing attribute values.
Inductive Logic Programming: From Logic of Discovery to Machine Learning
Hiroki ARIMURA Akihiro YAMAMOTO

INVITED PAPER

Vol: E83-D No: 1
Page(s): 10-18
Inductive Logic Programming (ILP) is the study of machine learning systems that use clausal theories in first-order logic as a representation language. In this paper, we survey the theoretical foundations of ILP from the viewpoints of Logic of Discovery and Machine Learning, and try to unify these two views with the support of the modern theory of Logic Programming. First, we define several hypothesis construction methods in ILP and give their proof-theoretic foundations by treating them as procedures that complete incomplete proofs. Next, we discuss the design of individual learning algorithms using these hypothesis construction methods. We review known results on learning logic programs in computational learning theory, and show that these algorithms are instances of a generic learning strategy with proof completion methods.
On the Necessity of Special Mechanisms for Handling Types in Inductive Logic Programming
Yutaka SASAKI

PAPER-Artificial Intelligence and Cognitive Science

Vol: E82-D No: 10
Page(s): 1401-1408
This paper demonstrates the necessity of special handling mechanisms for type (or sort) information when learning logic programs on the basis of background knowledge that includes type hierarchy. We have developed a novel relational learner RHB, which incorporates special operations to handle the computing of the least general generalization (lgg) of examples and the code length of logic programs with types. It is possible for previous learners, such as FOIL, GOLEM and Progol, to generate logic programs that include type information represented as is_a relations. However, this expedient has two problems: one in the computation of the code length and the other in the performance. We will illustrate that simply adding is_a relations to background knowledge as ordinary literals causes a problem in computing the code length of logic programs with is_a literals. Experimental results on artificial data show that the learning speed of FOIL exponentially slows as the number of types in the background knowledge increases. The hypotheses generated by GOLEM are about 30% less accurate than those of RHB. Furthermore, Progol is two times slower than RHB. Compared to the three learners, RHB can efficiently handle about 3000 is_a relations while still achieving a high accuracy. This indicates that type information should be specially handled when learning logic programs with types.
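As a rough illustration of what type-aware generalization can look like, the sketch below computes a least general generalization (anti-unification) of two atoms and annotates each introduced variable with the least common supertype of the constants it replaces. This is not RHB's algorithm: the term encoding (tuples for compound terms, strings for constants), the `parent` map standing in for is_a relations, and the `X0:type` variable notation are all assumptions made for the example.

```python
def ancestors(type_name, parent):
    """Return type_name followed by its supertypes, nearest first."""
    chain = [type_name]
    while type_name in parent:
        type_name = parent[type_name]
        chain.append(type_name)
    return chain

def least_common_supertype(a, b, parent):
    """Nearest type that is an ancestor of both a and b, or None."""
    seen = set(ancestors(a, parent))
    for candidate in ancestors(b, parent):
        if candidate in seen:
            return candidate
    return None

def lgg(s, t, parent, table=None):
    """Anti-unify two terms; the same pair of subterms always maps to the same variable."""
    table = {} if table is None else table
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg(a, b, parent, table) for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        sort = None
        if isinstance(s, str) and isinstance(t, str):
            sort = least_common_supertype(s, t, parent)
        table[(s, t)] = f"X{len(table)}" + (f":{sort}" if sort else "")
    return table[(s, t)]

# Hypothetical type hierarchy expressed as child -> parent (is_a) links.
parent = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "sparrow": "bird", "bird": "animal"}
generalized = lgg(("eats", "dog", "meat"), ("eats", "cat", "meat"), parent)
```

Here the hierarchy is consulted directly during generalization instead of being added to the background knowledge as ordinary is_a literals, which is the distinction the abstract draws.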
Learning from Expert Hypotheses and Training Examples
Shigeo KANEDA Hussein ALMUALLIM Yasuhiro AKIBA Megumi ISHII

PAPER-Artificial Intelligence and Cognitive Science

Vol: E80-D No: 12
Page(s): 1205-1214
We present a method for learning classification functions from pre-classified training examples and hypotheses written roughly by experts. The goal is to produce a classification function that has higher accuracy than either the expert's hypotheses or the classification function inductively learned from the training examples alone. The key idea in our proposed approach is to let the expert's hypotheses influence the process of learning inductively from the training examples. Experimental results are presented demonstrating the power of our approach in a variety of domains.
Learning Levels in Intelligent Tutoring Systems
Vadim L. STEFANUK

PAPER-Methodologies

Vol: E78-D No: 9
Page(s): 1103-1107
Intelligent Tutoring Systems (ITS) represent a wide class of computer-based tutoring systems, designed with extensive use of the technology of modern Artificial Intelligence. Successful applications of various expert systems and other knowledge-based systems of AI gave rise to a new wave of interest in ITS. Yet, many authors conclude that the practically valuable achievements of ITS are rather modest despite the relatively long history of attempts to use knowledge-based systems for tutoring. It is argued in this paper that some basic obstacles to designing really successful ITS are due to the lack of well-understood and sound models of the education process. The paper proposes to overcome these problems by borrowing the required models from AI and adjacent fields. In particular, the concept of Learning Levels from AI might be very useful both for giving a valuable retrospective analysis of computer-based tutoring and for suggesting some promising directions in the field of ITS.
An Approach to Concept Formation Based on Formal Concept Analysis
Tu Bao HO

PAPER-Machine Learning and Its Applications

Vol: E78-D No: 5
Page(s): 553-559
Computational approaches to concept formation often share a top-down, incremental, hill-climbing classification, and differ from each other in the concept representation and quality criteria. Each of them captures part of the rich variety of conceptual knowledge, and many are well suited only when the object-attribute distribution is not sparse. Formal concept analysis is a set-theoretic model that mathematically formulates the human understanding of concepts, and investigates the algebraic structure, the Galois lattice, of possible concepts in a given domain. Adopting from formal concept analysis the idea of representing concepts by mutually closed sets of objects and attributes, as well as the Galois lattice structure for concepts, we propose an approach to concept formation and develop OSHAM, a method that forms concept hierarchies with high utility scores and clear semantics and that remains effective even with sparse object-attribute distributions. In this paper we describe OSHAM, and in an attempt to show its performance we present experimental studies on a number of data sets from the machine learning literature.
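The notion of concepts as mutually closed sets of objects and attributes can be illustrated with a brute-force enumeration over a tiny context. This sketch shows only the standard formal concept analysis closure, not OSHAM itself; the function names and the toy context are assumptions, and the enumeration is exponential in the number of objects, so it is suitable for small examples only.

```python
from itertools import combinations

def common_attributes(objects, context):
    """Attributes shared by all given objects (all attributes when none are given)."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set().union(*context.values())

def common_objects(attributes, context):
    """Objects that have every one of the given attributes."""
    return {o for o, attrs in context.items() if attributes <= attrs}

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs that are closed under both mappings."""
    concepts = set()
    names = list(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Hypothetical object-attribute context.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "carp": {"swims"},
}
concepts = formal_concepts(context)
```

Ordering the resulting pairs by inclusion of their extents yields the Galois lattice mentioned in the abstract.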
A Stochastic Parallel Algorithm for Supervised Learning in Neural Networks
Abhijit S. PANDYA Kutalapatata P. VENUGOPAL

PAPER-Learning

Vol: E77-D No: 4
Page(s): 376-384
The Alopex algorithm is presented as a universal learning algorithm for neural networks. Alopex is a stochastic parallel process which has previously been applied in the theory of perception. It has also been applied to several nonlinear optimization problems such as the Travelling Salesman Problem. It estimates the weight changes by using only a scalar cost function, which is a measure of global performance. In this paper we describe the use of the Alopex algorithm for solving nonlinear learning tasks by multilayer feed-forward networks. Alopex has several advantages, such as the ability to escape from local minima, rapid algorithmic computation based on a scalar cost function, and synchronous updating of weights. We present the results of computer simulations for several tasks, such as learning of parity, encoder problems and the MONK's problems. The learning performance as well as the generalization capacity of the Alopex algorithm are compared with those of the backpropagation procedure, and it is shown that Alopex has specific advantages over backpropagation. An important advantage of the Alopex algorithm is its ability to extract information from noisy data. We investigate the efficacy of the algorithm for faster convergence by considering different error functions. We show that an information-theoretic error measure shows better convergence characteristics. The algorithm has also been applied to more complex practical problems such as undersea target recognition from sonar returns and adaptive control of dynamical systems, and the results are discussed.
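The abstract does not spell out the update rule, so the following is only a sketch of the correlation-based rule commonly given for Alopex: every weight moves by a fixed step, and the direction is drawn with a probability that depends on the product of that weight's previous change and the previous change in the scalar cost. The function name, the fixed temperature (published variants typically anneal it), the step size, and the best-so-far bookkeeping are assumptions made for illustration.

```python
import math
import random

def alopex_minimize(cost, weights, delta=0.01, temperature=0.001, steps=2000, seed=0):
    """Minimize `cost` using only its scalar value; all weights are updated together."""
    rng = random.Random(seed)
    prev_w = list(weights)
    prev_cost = cost(prev_w)
    best_w, best_cost = list(prev_w), prev_cost
    # The first move is random because no correlation is available yet.
    w = [wi + rng.choice((-delta, delta)) for wi in prev_w]
    for _ in range(steps):
        current_cost = cost(w)
        if current_cost < best_cost:
            best_w, best_cost = list(w), current_cost
        d_cost = current_cost - prev_cost
        new_w = []
        for wi, pwi in zip(w, prev_w):
            # Positive correlation: continuing upward is expected to raise the cost.
            correlation = (wi - pwi) * d_cost
            z = max(-50.0, min(50.0, correlation / temperature))  # avoid overflow
            p_down = 1.0 / (1.0 + math.exp(-z))
            step = -delta if rng.random() < p_down else delta
            new_w.append(wi + step)
        prev_w, prev_cost, w = w, current_cost, new_w
    return best_w, best_cost

def quadratic(w):
    # Hypothetical cost with its minimum at (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

start = [0.0, 0.0]
best_w, best_cost = alopex_minimize(quadratic, start)
```

Because the rule needs only the scalar cost and each weight's own last move, no gradient is computed, and the random draws give the search a chance to leave local minima.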